```julia
m, n = size(A); minmn = min(m, n)
k = trunc.howmany
1 ≤ k ≤ minmn ||
    throw(ArgumentError("trunc.howmany=$k must satisfy 1 ≤ k ≤ min(size(A))=$minmn"))
```
Do we generically throw an error if `truncrank` is bigger than the maximal size? Would it not be more useful to treat `truncrank` as an upper bound?
You are definitely right; I think I was still dealing with the fact that CUSOLVER doesn't accept that by default.
There is an additional question of whether we should catch that case and then not do any sketching at all, since we might as well just do the compact decomposition, but I'm fine with leaving that for a future optimization.
This does raise the question of whether we should error when `trunc.howmany` is larger than `sketch.howmany`.
The cuSOLVER interface does not have this problem, since it uses `k = trunc.howmany` and `p = sketch.howmany - trunc.howmany` and requires both to be positive. It does seem sensible not to allow specifying an algorithm where the truncation rank is higher than the sketching rank.
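A minimal sketch of how that could look, treating `truncrank` as an upper bound while still rejecting a truncation rank above the sketching rank. The helper name and argument handling here are hypothetical, not the package API:

```julia
# Hypothetical helper, not MatrixAlgebraKit code: clamp the truncation rank
# to min(size(A)) instead of erroring, but reject trunc.howmany values above
# sketch.howmany, mirroring cuSOLVER's k ≥ 1, p ≥ 0 convention.
function resolve_ranks(A::AbstractMatrix, trunc_howmany::Int, sketch_howmany::Int)
    minmn = min(size(A)...)
    trunc_howmany ≥ 1 ||
        throw(ArgumentError("trunc.howmany=$trunc_howmany must be positive"))
    trunc_howmany ≤ sketch_howmany ||
        throw(ArgumentError("trunc.howmany=$trunc_howmany exceeds sketch.howmany=$sketch_howmany"))
    k = min(trunc_howmany, minmn)       # truncation rank as an upper bound
    p = min(sketch_howmany, minmn) - k  # remaining oversampling, ≥ 0
    return k, p
end
```

With this convention `resolve_ranks(zeros(10, 6), 8, 10)` would simply clamp to `(6, 0)` rather than throw.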
This PR implements infrastructure for working with matrix sketches, in particular through Gaussian sampling.
The main additions are
`left_sketch!` and `right_sketch!`, which could be thought of as `left_orth` with a sketching truncation strategy (although not implemented that way). Additionally, this brings the `CUSOLVER_Randomized` algorithm implementation on equal footing by using a `Driver` to select between the `CUSOLVER` implementation and the `Native` one.

### Sketching API
- `left_sketch[!](A; howmany, kwargs...) -> (Q, B)` with `Q` an `m×k` isometry and `B = Q'·A`.
- `right_sketch[!](A; howmany, kwargs...) -> (B, Pᴴ)` with `Pᴴ` a `k×n` co-isometry and `B = A·Pᴴ'`.
- `abstract type SketchingStrategy <: AbstractAlgorithm` — supertype for sketching primitives.
- `GaussianSketching(howmany; numiter=2, rng=default_rng())` — the one concrete strategy currently shipped.

### Sketched SVD
- `SketchedAlgorithm{A,S,T,D}` — `@kwdef` wrapper bundling `alg` (small dense factorization), `sketch` (`SketchingStrategy`), `trunc` (`TruncationStrategy`), and `driver` (`Driver`), to denote decomposing via sketch + decompose + truncate. Mimics `TruncatedAlgorithm`.
- `svd_trunc[!]` and `svd_trunc_no_error[!]` now dispatch on `SketchedAlgorithm`. They allocate `(U, S, Vᴴ)` sized for the sketch (`k = sketch.howmany`), check shapes, then call `gesvdr!(driver, A, S, U, Vᴴ; sketch, alg, trunc)`.
- The truncation error is computed as `ε = sqrt((‖A‖+‖S‖)(‖A‖−‖S‖))`, identical to the old `CUSOLVER_Randomized` path.
- `gesvdr!(::CUSOLVER, A, S, U, Vᴴ; sketch, trunc, alg=nothing)` can also be called directly. It currently only accepts `sketch::GaussianSketching` and `trunc::TruncationByOrder` (i.e. `truncrank`) and ignores the inner `alg`. Requires dedicated overloads for the allocation of the output because of CUSOLVER conventions.
- `CUSOLVER_Randomized(; k, p, niters)` is reimplemented as `SketchedAlgorithm(; sketch=GaussianSketching(k + p; numiter=niters + 1), trunc=truncrank(k), driver=CUSOLVER())` as a non-breaking deprecation.

## Design choices/questions
1. `left_sketch!` vs `left_orth!`

I had a hard time deciding between overloading `left_orth!` directly or introducing a dedicated `left_sketch` function. For simplicity I kept it separate for now, mostly since this is easier to reason through while fleshing out the design, but there might be a reasonable point to be made for merging the two.

2. Sketching as an algorithm vs sketching as a truncation strategy
In the `svd_trunc` implementation, I chose to have both `trunc` and `sketch` as keyword arguments, rather than trying to fit this through the same keyword. Again, this is mostly for convenience while I was playing around with this, but we could conceive of a way to merge the two concepts, especially if we decide `left_orth!(A; trunc=gaussiansketch(...))` is a reasonable approach.

3. Allocating outputs
One of the things I struggled with the most is deciding what `initialize_output(svd_trunc!, A, ::SketchedAlgorithm)` should return.

The main issue is that we are already abusing this a bit for the `::TruncatedAlgorithm` codepath, where by convention the `USVh` we pass in is the one that will be used for the `svd_compact!` inside, and is not equivalent to the one that is output.

For `::SketchedAlgorithm`, I therefore chose to pass in the sketched sizes, since that is workspace that can be reused and has similar semantics, but it is definitely true that this is a bit unintuitive.

However, the biggest struggle in changing this is that our automatic differentiation implementations hinge on the fact that everything passes through `svd_trunc!(A, out, alg)` in the end, so simply not supporting passing in an `out` argument has quite some implications for the remainder of the code.

This is probably a design choice that we could consider revisiting for a v0.7 version of MatrixAlgebraKit, since for cases where the output size is hard to determine beforehand, it might not make much sense to allow passing it in as input.
The alternative would be to start playing around with "reallocation" strategies, where we shrink/expand provided inputs if they don't have the correct sizes. This does however feel like it might just be a bit too convoluted for what it buys us.
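To make that convention concrete, here is a self-contained toy version, with illustrative names and plain `LinearAlgebra.svd` standing in for the sketched path (this is not the package implementation): the passed-in `out` is sketch-sized workspace, while the returned factors are views truncated to the requested rank.

```julia
using LinearAlgebra

# Illustrative only: `out = (U, S, Vᴴ)` is workspace sized for the sketch
# rank, while the returned factors are views truncated to `k_trunc`.
function toy_sketched_svd!(A, out, k_trunc)
    U, S, Vᴴ = out
    k_sketch = length(S)                  # workspace determines the sketch rank
    F = svd(A)                            # stand-in for sketch + small dense SVD
    copyto!(U, @view F.U[:, 1:k_sketch])
    copyto!(S, @view F.S[1:k_sketch])
    copyto!(Vᴴ, @view F.Vt[1:k_sketch, :])
    return @view(U[:, 1:k_trunc]), @view(S[1:k_trunc]), @view(Vᴴ[1:k_trunc, :])
end
```

The workspace then has the `TruncatedAlgorithm`-style semantics described above: what you pass in is not what you get back.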
4. Truncation error formula
This is the `‖A‖_F² − ‖S‖_F²` identity, written in factored form for numerical stability.

However, I had to significantly lower the tolerances for any such tests, since the numerical accuracy of the sketching and the floating-point errors are quite finicky here.
This is of course the same discussion we had before, where it is not clear to me that `svd_trunc` really should have the responsibility to provide truncation errors, and `svd_trunc_no_error` is a way more sensible default mode.

Presumably this is fine as long as it is clearly documented?
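As a small self-contained illustration (not package code) of why the factored form helps: when `‖S‖` captures almost all of `‖A‖`, squaring first loses the small difference to rounding, while `(‖A‖+‖S‖)(‖A‖−‖S‖)` computes the difference before any magnification.

```julia
normA = 1.0f3                 # ‖A‖_F in Float32
normS = prevfloat(normA, 3)   # ‖S‖_F just below it: almost no truncation error

naive  = sqrt(normA^2 - normS^2)                    # squares round near 1e6
stable = sqrt((normA + normS) * (normA - normS))    # subtraction is exact here
exact  = sqrt(Float64(normA)^2 - Float64(normS)^2)  # Float64 reference
```

Here `stable` agrees with the Float64 reference to full single precision, while `naive` is off by roughly a percent.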
5. Gauge fixing depends on the exact truncation path
In the current implementation, the gauge-fixing happens at the moment of decomposing the inner small core tensor, which will then still get unprojected onto a larger space through the sketch.
While it is somewhat straightforward to change this, it might be confusing that there are `gaugefix` keywords in multiple places that don't have the same effect.

I'm not sure what the best way forward is here, but this might require some more discussion.
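As a toy illustration of the issue, using a common sign convention as a stand-in gauge rule (the package's actual `gaugefix` behaviour may differ): fixing the gauge on the small core factors does not survive the lift through the sketch isometry.

```julia
using LinearAlgebra

# Stand-in gauge rule: make the largest-magnitude entry of each column of U
# positive, compensating in the corresponding row of Vᴴ.
function fix_sign_gauge!(U, Vᴴ)
    for j in axes(U, 2)
        i = argmax(abs.(@view U[:, j]))
        s = sign(U[i, j])
        if !iszero(s)
            U[:, j] .*= s
            Vᴴ[j, :] .*= s
        end
    end
    return U, Vᴴ
end

# Gauge-fix the small core factors, then lift through a (here trivial) sketch
# isometry Q: the lifted factor no longer satisfies the convention.
Usmall, Vᴴ = Matrix(1.0I, 3, 3), Matrix(1.0I, 3, 3)
fix_sign_gauge!(Usmall, Vᴴ)
Q = Diagonal([1.0, -1.0, 1.0])   # orthogonal, so still an isometry
U = Q * Usmall                   # U[2, 2] == -1: convention is broken
```

So gauge-fixing the core factors and gauge-fixing the final, unprojected factors are genuinely different operations, which is why keywords at the two levels cannot have the same effect.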
TODO